AKADEMI EDUCATION – First Cohort (2025): Data Science & AI

First Project: Data Analysis & Engineering - Phase 1

Student name: Riché FLEURINORD
Student pace: self-paced
Deadline Submission: June 8, 2025
Instructors' Names: Wedter JEROME & Geovany Batista Polo LAGUERRE
Blog post URL (GitHub Repository Link): https://github.com/richefleuriord/Fleurinord_Dsc_Aviation_Project.git

Project Title

Flight Risk: A Data-Driven Analysis of Aviation Accidents (1962–2023)

Overview

This data science project analyzes aviation accident data from 1962 to 2023 to support strategic decision-making in the aviation sector. Through data cleaning, exploration, and visualization, the goal is to identify low-risk aircraft models and generate actionable insights for business stakeholders considering investment in aviation.

Business Problem

To support a strategic investment analysis in the aviation sector, I propose to examine historical trends in aviation accidents in order to identify the most reliable aircraft profiles. This approach aims to help a fictional company allocate its resources wisely by minimizing the risks associated with purchasing and operating commercial and private aircraft.

By analyzing accident data collected by the National Transportation Safety Board from 1962 to 2023, I will highlight aircraft models, common causes of incidents, and high-risk contexts. The goal is to produce actionable recommendations to guide the company’s decisions and enhance safety, while ensuring effective cost management and future operations in this new sector.

1-Data Understanding

The dataset used in this project comes from the National Transportation Safety Board (NTSB) and covers aviation events that occurred between 1962 and 2023. It includes both accident and incident investigations, making it a valuable source for analyzing aviation-related risks.

Each event is associated with a unique identifier and contains detailed information such as the date and location of the event, characteristics of the aircraft involved (manufacturer, model, number of engines, engine type), weather conditions, type of flight (commercial, private, etc.), and human consequences (injuries, fatalities).

This initial step aims to:

1- Explore the structure of the dataset,

2- Identify the types of variables available,

3- Detect any missing or inconsistent values,

4- And gain a global understanding of the data to guide the upcoming exploratory analysis and strategic recommendations.

1.1 Importing the necessary libraries

1.2 Loading the datasets

1.3 Overview of the df dataset

We examined the dataset using the .info() method to understand its structure. The dataset contains 88,889 rows and 31 columns. Among these, 5 columns are of type float64 (numerical), while the remaining 26 columns are of type object (usually categorical or string data).

Several columns contain missing values, especially:

1- "Latitude" and "Longitude" have data for only ~34,000 rows.

2- "Aircraft.Category" and "FAR.Description" also have many missing values.

3- Some columns like "Schedule" and "Air.carrier" are very sparsely filled.

The duplicate-row check returns 0, which means that no row in the DataFrame is duplicated.
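The structural checks described in this section can be sketched as follows. This is a minimal illustration on a tiny stand-in DataFrame, not the real NTSB file: the sample rows and the Event.Id column are invented for the example.

```python
import pandas as pd

# Tiny illustrative stand-in for df; the values are invented.
df = pd.DataFrame({
    "Event.Id": ["A1", "A2", "A3"],
    "Latitude": [61.2, None, None],
    "Longitude": [-149.9, None, None],
    "Aircraft.Category": ["Airplane", None, "Helicopter"],
})

n_rows, n_cols = df.shape                   # (number of rows, number of columns)
dtype_counts = df.dtypes.value_counts()     # how many columns share each dtype
n_duplicates = int(df.duplicated().sum())   # count of fully duplicated rows

df.info()                                   # non-null count and dtype per column
print(n_rows, n_cols, n_duplicates)
```

A result of 0 for n_duplicates corresponds to the situation reported above, where no row is repeated.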

1.4 Overview of the df_1 dataset

1.5 Checking for missing values

To assess the quality of our dataset, we examined the number of missing values in each column. The columns “Schedule”, “Air.carrier”, and “FAR.Description” have a particularly high number of missing entries (over 50,000), which could significantly impact the analysis or modeling. These variables will require special attention during the data preprocessing phase, depending on their relevance to the problem.
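As a hedged sketch, missing values can be counted per column and also expressed as a percentage of rows. The sample values below are invented; only the column names Schedule and Air.carrier are taken from this report.

```python
import pandas as pd

# Invented sample rows; the real dataset has 88,889 rows.
df = pd.DataFrame({
    "Schedule": [None, None, "NSCH", None],
    "Air.carrier": [None, None, None, "Example Air"],
    "Make": ["Cessna", "Piper", "Beech", "Cessna"],
})

# Absolute number of missing entries per column, largest first.
missing_count = df.isna().sum().sort_values(ascending=False)

# Share of missing entries per column, in percent.
missing_pct = (df.isna().mean() * 100).round(1)

print(missing_count)
print(missing_pct)
```

The percentage view makes columns of different lengths or fill rates directly comparable.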

1.6 Completeness Analysis (%)

Several critical fields show a high percentage of missing values, particularly those related to location, aircraft information, and scheduling. To ensure robust and reliable analysis, it is recommended to apply data cleaning or imputation techniques, or to consider alternative data sources.

1.7 Statistical description of the numerical columns

These statistics show that:

1- Most incidents involve small aircraft with no injuries or fatalities.

2- There are extreme cases with large numbers of injuries or passengers, likely corresponding to commercial aviation accidents.

3- The dataset is skewed and contains outliers, which may need special treatment during analysis.

2-Data Preparation

2.1 Cleaning of unnecessary columns

To improve the quality of our database and facilitate the subsequent stages of analysis, we removed certain columns deemed irrelevant to our research question. The deleted columns either had a high rate of missing values, making them unreliable for analysis, or offered little added value in assessing the risks associated with different aircraft models.

This data cleaning step aims to reduce the dimensionality of the dataset, limit noise in the data, and focus the analysis on variables that are truly useful for identifying the most reliable aircraft models.
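One possible way to select sparse columns for removal is a missing-share threshold. The sketch below is an assumption, not the exact rule used in the project: the 50% cut-off (MAX_MISSING_SHARE) is hypothetical and the sample rows are invented.

```python
import pandas as pd

# Invented sample rows for illustration.
df = pd.DataFrame({
    "Make": ["Cessna", "Piper", "Beech", "Cessna"],
    "Schedule": [None, None, "NSCH", None],
    "Air.carrier": [None, None, None, None],
})

MAX_MISSING_SHARE = 0.5  # hypothetical cut-off, not taken from the project

# Fraction of missing values in each column.
missing_share = df.isna().mean()

# Columns whose missing share exceeds the cut-off are dropped.
cols_to_drop = missing_share[missing_share > MAX_MISSING_SHARE].index.tolist()
df_reduced = df.drop(columns=cols_to_drop)

print(cols_to_drop)
```

Columns judged irrelevant to the research question can be added to cols_to_drop by hand before calling drop.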

2.2 Imputation of missing values for numerical variables

To address the missing values in the numerical variables, we opted for median imputation rather than mean imputation. This method is particularly suitable for datasets that may contain extreme or outlier values, as is often the case with accident-related databases.

The median, being a robust measure of central tendency, helps limit the impact of outliers on the analysis results. By choosing this approach, we ensure better representativeness of the overall data, which is essential for the reliability of both descriptive statistics and the predictive models that will be developed later on.
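The effect of median imputation in the presence of an outlier can be sketched as below. The column name Total.Fatal.Injuries and the values are assumptions chosen for the example.

```python
import pandas as pd

# Invented values with one extreme case (200) and one missing entry.
df = pd.DataFrame({
    "Total.Fatal.Injuries": [0.0, 0.0, None, 1.0, 200.0],
    "Make": ["Cessna", "Piper", "Beech", "Cessna", "Boeing"],
})

# Select numeric columns and compute one median per column.
numeric_cols = df.select_dtypes(include="number").columns
medians = df[numeric_cols].median()

# Fill each numeric column with its own median.
df[numeric_cols] = df[numeric_cols].fillna(medians)

print(df)
```

In this sample the missing entry is filled with the median of the observed values (0.5), whereas the mean (50.25) would have been pulled upward by the single extreme value.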

2.3 Imputation of missing values in categorical variables

We impute the missing values in the qualitative (categorical) variables using the most frequently observed value in each variable. This approach allows us to retain the data without introducing major bias.

2.4 Cleaning of invalid values in categorical variables

We replaced non-informative or incorrect values (such as 'Unknown', 'N/A', 'none', etc.) in each categorical variable with NaN, and then imputed these missing values using the most frequent value (mode) of each column. This ensures that the analysis is based on meaningful and consistent data, without losing useful observations.
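A minimal sketch of this two-step cleaning (invalid labels to NaN, then mode imputation) is given below. The token list, the column names, and the sample values are illustrative assumptions.

```python
import pandas as pd

# Invented sample rows; column names are assumed for the example.
df = pd.DataFrame({
    "Weather.Condition": ["VMC", "Unknown", "VMC", "IMC", None],
    "Engine.Type": ["Reciprocating", "N/A", "none", "Reciprocating", "Turbo Fan"],
})

INVALID_TOKENS = ["Unknown", "N/A", "none"]  # illustrative list of placeholders

for col in ["Weather.Condition", "Engine.Type"]:
    # Step 1: turn non-informative labels into missing values.
    df[col] = df[col].mask(df[col].isin(INVALID_TOKENS))
    # Step 2: fill missing values with the most frequent remaining label.
    most_frequent = df[col].mode(dropna=True).iloc[0]
    df[col] = df[col].fillna(most_frequent)

print(df)
```

Note that mode imputation strengthens the dominant category, so it is best suited to columns where missing values are relatively few.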

2.5 Extraction of the location id from the Location column

We created a new column called Location.Id by extracting the second part of the string contained in the Location column, after the comma (often a state or region), and by removing any extra spaces. This allows us to isolate more precise geographical information to facilitate analysis or visualization.
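The extraction can be sketched with pandas string methods. The sample locations are invented; a value without a comma yields a missing Location.Id in this sketch.

```python
import pandas as pd

# Invented sample locations, including one without a comma.
df = pd.DataFrame({"Location": ["ANCHORAGE, AK", "Fresno,  CA ", "MIAMI"]})

# Split on the comma, keep the second part, and trim surrounding spaces.
df["Location.Id"] = df["Location"].str.split(",").str[1].str.strip()

print(df)
```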

2.6 Reorganizing Columns to Optimize the Dataset Structure

We removed the Location column, which became redundant after extracting the geographical identifier (Location.Id). Then, we repositioned the Location.Id column at the beginning of the dataset to highlight this key geographical information for future analyses. This reorganization aims to improve the readability of the data table and make the most relevant variables more accessible from the first columns.

2.7 Conversion of the Event.Date column to date format

We converted the values in the Event.Date column into pandas datetime objects. This transformation makes it easier to perform temporal analyses such as chronological sorting, extracting year/month/day components, or calculating the duration between two events.

2.8 Extraction of the year and day of the week from the Event.Date column

We extracted two pieces of information from the Event.Date column:

  1. Event.year: the year of the event.
  2. Event.weekday: the name of the day of the week (e.g., Monday, Tuesday, etc.).

These new columns facilitate detailed temporal analyses, such as studying events by year or by day of the week.
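The date conversion and the two derived columns can be sketched as follows. The sample dates are invented, and errors="coerce" is an assumption that turns unparseable strings into missing dates rather than raising an error.

```python
import pandas as pd

# Invented sample dates stored as strings.
df = pd.DataFrame({"Event.Date": ["1982-01-01", "2001-09-15", "2022-12-26"]})

# Convert to pandas datetime objects.
df["Event.Date"] = pd.to_datetime(df["Event.Date"], errors="coerce")

# Derive the year and the weekday name from the datetime column.
df["Event.year"] = df["Event.Date"].dt.year
df["Event.weekday"] = df["Event.Date"].dt.day_name()

print(df)
```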

2.9 Standardization of labels in the Injury.Severity variable

This operation replaces all values of the form Fatal(n) (where n is a number) with simply Fatal in the Injury.Severity column. This standardizes the labels in this categorical variable to avoid duplicates (such as Fatal(1), Fatal(2)...) that represent the same concept, thereby improving the quality of the analyses.
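A regular expression is one way to perform this replacement. The sketch below uses invented sample labels and only rewrites values that match Fatal(n) exactly.

```python
import pandas as pd

# Invented sample labels.
severity = pd.Series(["Fatal(1)", "Fatal(12)", "Non-Fatal", "Fatal", "Incident"])

# Replace "Fatal(<digits>)" with the plain label "Fatal".
cleaned = severity.str.replace(r"^Fatal\(\d+\)$", "Fatal", regex=True)

print(cleaned.value_counts())
```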

2.10 Normalization of Aircraft Manufacturer Names

This step prevents duplicates caused by differences in letter casing (e.g., “CESSNA” vs. “Cessna”) and ensures a more consistent and reliable analysis of aircraft manufacturers. By harmonizing the values, we can group and count them accurately without bias introduced by formatting inconsistencies.
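A minimal sketch of this harmonization is shown below. The choice of title case (and the extra whitespace trimming) is an assumption; any single consistent casing would serve the same purpose.

```python
import pandas as pd

# Invented sample values with inconsistent casing and spacing.
make = pd.Series(["CESSNA", "Cessna", " cessna ", "PIPER", "Piper"])

# Trim whitespace and apply one casing convention.
make_clean = make.str.strip().str.title()

counts = make_clean.value_counts()
print(counts)
```

Before cleaning, the sample contains five distinct spellings; afterwards only two manufacturers remain.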

2.11 Export of the cleaned dataset

This line of code exports the cleaned DataFrame to a CSV file named Cleaned_AviationData.csv. The parameter index=False ensures that the DataFrame index is excluded from the final file, as it is not needed for analysis or sharing purposes.

2.12 Filtering data for the United States

Given that the United States accounts for over 92% of the observations in the df dataset, this filtering step aims to create a subset called df_usa dedicated exclusively to events that occurred in the U.S.

This allows the analysis to focus on the dominant country in the dataset, which is particularly relevant for the next steps, as a complementary dataset contains columns specific to the United States. This subset will enable more focused and consistent analyses while avoiding interference from marginal data from other countries.
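The filtering step can be sketched as below. The exact country label ("United States") and the sample rows are assumptions for the example.

```python
import pandas as pd

# Invented sample rows.
df = pd.DataFrame({
    "Country": ["United States", "United States", "Brazil", "United States"],
    "Make": ["Cessna", "Piper", "Embraer", "Beech"],
})

is_us = df["Country"] == "United States"

# Share of observations located in the United States, in percent.
us_share = is_us.mean() * 100

# Independent copy so later edits do not affect df.
df_usa = df[is_us].copy()

print(us_share, len(df_usa))
```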

2.13 Merging U.S.-specific data

In this step, we performed a merge between the df_usa dataset, which contains only accident data from the United States (a country representing over 92% of the overall dataset), and a second dataset named df_1, which includes two columns: US_State (the full name of the state) and Abbreviation (the state’s abbreviation).

This merge was carried out using the Location.Id column from df_usa, which corresponds to the state abbreviation, and the Abbreviation column from df_1. The goal of this operation is to enrich the U.S. data with additional geographical information, particularly the full names of the states, to facilitate spatial analysis of aviation accidents across the United States and to improve the readability of future visualizations.

2.14 Removal of a redundant column after merging

After merging the U.S. data with a reference dataset containing the full state names (US_State) and their abbreviations (Abbreviation), we found that the Abbreviation column had the same values as the Location.Id column. Since Location.Id was already present in the main dataset and contained the U.S. state abbreviations, we decided to keep it and remove the Abbreviation column. This operation aims to eliminate redundancy, clarify the dataset structure, and improve readability without any loss of information.
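The merge and the removal of the redundant key can be sketched as follows. The join type (how="left") is an assumption, since the report does not state it, and the sample rows are invented.

```python
import pandas as pd

# Invented sample rows; "ZZ" has no match in the reference table.
df_usa = pd.DataFrame({
    "Location.Id": ["AK", "CA", "ZZ"],
    "Make": ["Cessna", "Piper", "Beech"],
})
df_1 = pd.DataFrame({
    "US_State": ["Alaska", "California"],
    "Abbreviation": ["AK", "CA"],
})

# Attach the full state name by matching abbreviations.
merged = df_usa.merge(df_1, how="left", left_on="Location.Id", right_on="Abbreviation")

# Abbreviation duplicates Location.Id, so it is dropped.
merged = merged.drop(columns=["Abbreviation"])

print(merged)
```

With a left join, rows whose Location.Id has no matching abbreviation are kept with a missing US_State; how="inner" would discard them instead.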

2.15 Reorganization of Key Columns for Better Readability and Temporal/Geographic Analysis

This operation involves placing at the top of the table the most strategic columns for temporal (Event.Date, Event.year, Event.weekday), geographic (Country, US_State, Location.Id), and investigation-type (Investigation.Type) analysis. This reordering improves readability, facilitates exploratory data analysis, and simplifies the creation of filters or future visualizations. The remaining columns are still included in the dataset but are positioned after the key ones, allowing for a more structured and analysis-friendly layout.
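One simple way to reorder the columns is to list the key columns first and append the others in their existing order. The sketch uses an empty DataFrame with an invented starting column order.

```python
import pandas as pd

# Empty frame with an invented initial column order.
df_usa = pd.DataFrame(columns=[
    "Make", "Event.Date", "Country", "Model", "US_State",
    "Event.year", "Location.Id", "Investigation.Type", "Event.weekday",
])

key_cols = [
    "Event.Date", "Event.year", "Event.weekday",
    "Country", "US_State", "Location.Id",
    "Investigation.Type",
]

# Keep the remaining columns in their original relative order.
other_cols = [c for c in df_usa.columns if c not in key_cols]
df_usa = df_usa[key_cols + other_cols]

print(list(df_usa.columns))
```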

2.16 Export of the cleaned U.S.-specific dataset

Once all the data cleaning, transformation, and filtering steps were completed for the U.S.-specific subset, we exported this dataset to a new file titled Cleaned_USData.csv. This export allows us to save a clean, consistent, and geographically accurate version of aviation events that occurred in the United States. It facilitates future analyses, visualizations, or data sharing, while avoiding the need to repeat the preprocessing steps already performed.

3-Analysis and Results

Part A-Descriptive Analysis of the Aviation Sector (USA, 1962–2023)

3.0 Annual Trend of Aviation Accidents (1962–2023)

The analysis of aviation accident trends in the United States from 1962 to 2023 reveals a three-phase dynamic. The initial phase (1962–1980) is characterized by a complete absence of recorded data in the dataset. This is followed by a sudden spike starting in 1980, with the number of accidents exceeding 3,500—suggesting either an enhancement in reporting systems or a period of high accident rates, particularly in general aviation.

Subsequently, there is a structurally declining trend, continuing into the 2020s, where accidents stabilize at around 1,000 to 1,200 per year. This sustained decrease likely reflects the combined effects of stronger regulations, aircraft modernization, improved safety protocols, and a more professionalized aviation sector. Overall, the long-term trajectory points to a significant improvement in aviation safety over the decades.
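The yearly series behind such a trend can be obtained by counting events per year. The sketch below uses a handful of invented years in place of the real Event.year column.

```python
import pandas as pd

# Invented sample of event years.
df_usa = pd.DataFrame({"Event.year": [1982, 1982, 1982, 1990, 1990, 2022]})

# Count events per year and sort chronologically.
accidents_per_year = df_usa["Event.year"].value_counts().sort_index()

print(accidents_per_year)
```

The resulting Series can be passed directly to a line plot to visualize the trend.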

3.1 Distribution by Flight Type (Purpose.of.flight)

The distribution of aviation accidents by flight type reveals a strong predominance of "Personal" flights (private or recreational), which account for the vast majority of recorded incidents. The second most frequent category, "Instructional" flights (training flights), occurs at a rate roughly four times lower than personal flights, further highlighting the dominance of personal aviation in accident statistics.

Other flight types appear in much smaller, often marginal proportions.

This statistical configuration suggests that personal aviation is the segment most represented in accident records, well ahead of instructional flying.

These findings are crucial for informing strategic recommendations, especially regarding pilot training, aircraft maintenance, and aircraft selection for lower-risk activities.

3.2 Proportion of Amateur-Built Aircraft (Amateur.Built)

Among aviation accidents recorded in the United States between 1962 and 2023, 10.1% involved amateur-built aircraft, while 89.9% involved aircraft from certified manufacturers. Although amateur-built aircraft represent a minority of cases, their proportion remains significant and deserves particular attention in risk assessment. These aircraft, often used in private or recreational contexts, may present vulnerabilities related to quality control, maintenance, or technical performance. Therefore, even though they account for only a small share of air traffic, their notable involvement in accidents calls for a dedicated analysis before considering investment in such aircraft.

3.3 Distribution by Aircraft Type (Make)

The analysis of accident distribution by aircraft manufacturer (Make) reveals a strong concentration around a few dominant producers. Cessna stands out significantly, with the highest frequency of accidents, representing a predominant share of the dataset. Piper follows in second place, with an accident count slightly over half of Cessna’s total, while Beech ranks third, with roughly one-third of Piper’s figures. Other manufacturers show substantially lower frequencies.

This distribution suggests that Cessna, as a market leader in light aviation in the United States, is logically more exposed to accidents—not necessarily due to safety issues, but likely due to its overrepresentation in the general aviation fleet.

Nonetheless, this dominance calls for a future adjustment based on fleet size or total aircraft in operation, to derive more robust conclusions regarding the relative reliability or risk levels associated with each manufacturer.

3.4 Ranking of Aircraft Manufacturers by Accident Frequency

The analysis of accident counts by aircraft manufacturer reveals a highly unbalanced distribution, with a significant concentration among the top three producers. Cessna overwhelmingly leads the ranking with 25,877 accidents, accounting for approximately 45% of the total accidents observed within this top 10. It is followed by Piper (14,155 accidents), whose total is about 45% lower than Cessna’s, and Beech (5,058 accidents), which represents only about 36% of Piper’s figure.

This sharp decline in accident frequency is characteristic of a Pareto distribution, where a small number of manufacturers account for the majority of incidents. This phenomenon is largely explained by the strong market presence of Cessna and Piper in U.S. general aviation, due to their widespread use in private flights, pilot training, and small-scale commercial operations. Therefore, the overrepresentation of these brands does not necessarily reflect lower technical reliability, but rather greater operational exposure, which statistically increases the likelihood of an incident.

Manufacturers such as Bell, Robinson, and Hughes—primarily involved in the helicopter segment—show much lower accident numbers, which is consistent with their smaller fleet sizes and more specialized usage (e.g., emergency response, surveillance). Finally, Boeing, despite being a major player in commercial aviation, holds an intermediate position with 1,474 recorded accidents, reflecting a low relative frequency of incidents when considering the volume of passengers carried and flight hours logged.

In summary, this distribution highlights significant disparities in accident frequency by manufacturer, primarily revealing differences in operational scale. A rigorous assessment of the intrinsic safety of manufacturers would require adjusting for exposure variables such as the number of aircraft in service or cumulative flight hours.

3.5 Count 'Fatal' cases by manufacturer

The analysis of fatal accident counts by aircraft manufacturer reveals a significant concentration of severe cases among a few dominant aircraft makers. At the top of the list, Cessna records 3,926 fatal accidents, accounting for a substantial share of all “Fatal” cases, likely due in part to its large presence in the light and private aviation market. It is followed by Piper (2,782 cases) and Beech (1,395 cases), both of which also appear frequently in serious incidents. These three manufacturers alone account for more than 8,000 fatal cases (8,103 in total), highlighting their marked prevalence.

Subsequent manufacturers such as Bell (372 cases) and Mooney (354 cases) report significantly lower volumes, although still notable. The bottom of the top 10 includes Bellanca, Robinson, Grumman, North American, and Hughes, with frequencies ranging between 100 and 211 cases.

This distribution suggests an exposure effect—with more aircraft in operation—combined with technical or operational factors specific to each manufacturer. It highlights the need for a deeper analysis of fatality rates by manufacturer (i.e., the ratio of fatal cases to total incidents), in order to distinguish between sheer volume and intrinsic risk.
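As a rough illustration of the fatality rate mentioned above, the counts quoted in sections 3.4 and 3.5 can be combined directly. This assumes that both sets of counts were computed on the same subset of the data with the same manufacturer labels, which the report does not state explicitly.

```python
# Counts quoted in sections 3.4 (all accidents) and 3.5 (fatal cases).
total_accidents = {"Cessna": 25877, "Piper": 14155, "Beech": 5058}
fatal_cases = {"Cessna": 3926, "Piper": 2782, "Beech": 1395}

# Fatality rate = fatal cases / total recorded accidents, in percent.
fatality_rate = {
    make: fatal_cases[make] / total_accidents[make] * 100
    for make in total_accidents
}

for make, rate in fatality_rate.items():
    print(f"{make}: {rate:.1f}% of recorded accidents are fatal")
```

With these figures the ratios come out to roughly 15.2% for Cessna, 19.7% for Piper, and 27.6% for Beech, which shows that the ranking by volume and the ranking by fatality rate need not coincide.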

The analysis of the subset of accidents resulting in fatal injuries (i.e., records labeled “Fatal” in the Injury.Severity variable) reveals that a large number of manufacturers were involved in only a single fatal accident each. Specifically, the bottom ten manufacturers in this ranking, such as Hocker, Senior Aerosport/Paet, Dellicker, and Romeo, each recorded exactly one fatal accident throughout the entire study period.

This long-tail distribution illustrates a classic case of extreme dispersion of rare events, a phenomenon often referred to in statistics as a heavy-tailed distribution. Among the more than 2,300 manufacturers that experienced at least one fatal accident, the vast majority are represented by a marginal number of incidents, indicating very low frequency. This suggests that the involvement of these manufacturers in fatal accidents likely stems from exceptional circumstances or from an extremely limited operational volume (e.g., handmade, experimental, or niche aircraft models).

From a decision-making standpoint, this situation implies that manufacturers in this long tail of the distribution contribute little statistically meaningful information for evaluating structural safety. Conversely, the analysis should focus more on manufacturers with a significant frequency of “Fatal” cases, as these may reveal systematic trends or heightened exposure to risk, thereby enabling more robust and actionable safety recommendations.

Part B-Severity and Reliability

3.6 Distribution of Accident Severity by Manufacturer (Make)

The analysis of accident severity distribution by manufacturer highlights a strong concentration of cases involving Cessna and Piper. These two manufacturers dominate the chart, primarily due to their significant presence in the U.S. general aviation market, resulting in a much higher volume of flights compared to other manufacturers.

For Cessna, approximately 80% of accidents are non-fatal, 15% are fatal, and a small fraction corresponds to minor incidents without injuries or significant damage. This distribution suggests that, despite the high absolute number of accidents, the proportion of severe cases remains relatively moderate. This could indicate good structural resilience of the aircraft or the effectiveness of emergency protocols.

Piper, the second most represented manufacturer, shows a similar but slightly less favorable profile: about 75% of accidents are non-fatal, nearly 20% are fatal, and the remaining are minor incidents. This somewhat higher severity rate compared to Cessna might reflect technical or operational differences between models or variations in usage patterns.

Finally, although other manufacturers report a much lower number of accidents—likely due to fewer aircraft in operation—some show a relatively high proportion of fatal cases. This observation underscores the importance of not relying solely on absolute accident counts but instead examining accident frequencies alongside the fatality rate (the proportion of "Fatal" cases), in order to better assess the intrinsic risk associated with each manufacturer.
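The share of each severity level within a manufacturer can be computed with a row-normalized cross-tabulation. The sketch below uses invented rows, so its percentages do not reproduce the figures reported above.

```python
import pandas as pd

# Invented sample rows.
df_usa = pd.DataFrame({
    "Make": ["Cessna", "Cessna", "Cessna", "Cessna", "Piper", "Piper"],
    "Injury.Severity": ["Non-Fatal", "Non-Fatal", "Non-Fatal", "Fatal",
                        "Non-Fatal", "Fatal"],
})

# normalize="index" makes each manufacturer's row sum to 1 before scaling to percent.
severity_share = pd.crosstab(
    df_usa["Make"], df_usa["Injury.Severity"], normalize="index"
) * 100

print(severity_share)
```

The same pattern applies to any other grouping variable, such as Weather.Condition.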

Part C-Contextual analysis

3.7 Impact of weather conditions (Weather.Condition) on accident severity

The statistical analysis of accident severity by weather condition reveals a striking contrast between VMC (Visual Meteorological Conditions) and IMC (Instrument Meteorological Conditions). Under VMC, which corresponds to favorable weather and visual flying, more than 80% of accidents are non-fatal and less than 20% are fatal. Under IMC, where poor visibility requires pilots to rely on instruments, approximately half of the accidents result in fatalities. This difference suggests that IMC greatly increases the likelihood of severe outcomes when accidents occur, possibly due to reduced situational awareness, increased pilot workload, and the complexity of navigation, making weather a critical factor in aviation safety.

Part C-Temporal analysis

3.8 Accidents by day of the week (Event.weekday)

The analysis of accident distribution by day of the week reveals a noticeable concentration during the weekend, with a peak observed on Saturday, closely followed by Sunday, and then Friday. The other weekdays show relatively similar and significantly lower levels. This pattern suggests an increase in recreational or private flight activity at the end of the week, which mechanically leads to a higher number of accidents during this period. Therefore, the higher frequency of accidents on Fridays, Saturdays, and Sundays can be interpreted as a reflection of increased flight volume on these days rather than an intrinsic increase in risk.

3.8 Seasonal analysis (month extracted from Event.Date)

The graph shows a roughly symmetric distribution centered on the summer, with the highest number of accidents occurring in June, July, and August. This trend is consistent with seasonal patterns, as these months are typically associated with increased travel activity. The rise in accidents during this time can therefore be explained by a greater volume of flights rather than an inherent increase in risk.
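Monthly counts can be derived from the event date as sketched below. The dates are invented, and the reindex step is an assumption added so that months without events still appear with a count of 0.

```python
import pandas as pd

# Invented sample of event dates.
dates = pd.to_datetime(
    pd.Series(["2019-06-03", "2019-07-14", "2019-07-20", "2019-12-01"])
)

# Count events per calendar month (1 = January, ..., 12 = December).
accidents_per_month = (
    dates.dt.month.value_counts().reindex(range(1, 13), fill_value=0)
)

print(accidents_per_month)
```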

Part D-Geographical analysis

3.9 Map of accidents by State (US_State)

The map shows that the state of California (CA) has the highest number of accidents, with a total of 8,857 incidents. It is followed by Alaska (AK), Texas (TX), and Florida (FL), each recording over 5,500 accidents. In contrast, states such as North Dakota (ND), South Dakota (SD), and West Virginia (WV) have fewer than 600 accidents. The remaining states report accident levels ranging between 1,000 and 3,000 cases. This distribution may reflect differences in air traffic volume, geographical size, or flight conditions across states.

Part E-Analysis by actual severity

3.10 Creation of a severity score

The Gravity.Score variable is a weighted measure of accident severity, assigning 3 points for fatal injuries, 2 for serious injuries, and 1 for minor injuries. This method allows for a more nuanced quantification of the human impact of each accident, beyond a simple binary classification (Fatal / Non-Fatal).

Descriptive analysis of this variable shows a highly skewed distribution, with a median (50th percentile) of 0 and a 75th percentile of 3, indicating that 75% of the accidents have a severity score of 3 or less. This concentration toward lower values suggests that the majority of accidents involved no injuries or only minor ones. The mean score is 1.83 with a standard deviation of 7.67, reflecting substantial dispersion and pointing to the presence of extreme values. Indeed, the maximum score reaches 795, indicating extremely severe cases with a high number of victims.

The presence of such variability clearly supports the use of this weighted score to better differentiate levels of severity and guide further analysis (such as categorization, mapping, or correlations with other variables like weather conditions or aircraft type). Overall, this approach provides a more precise and objective reading of the severity of aviation accidents, which is essential for meaningful comparison and predictive modeling.
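A minimal sketch of the weighted score is given below. It assumes that the weights are applied per injured person and that the injury counts live in columns named Total.Fatal.Injuries, Total.Serious.Injuries, and Total.Minor.Injuries; both points are assumptions, and the sample values are invented.

```python
import pandas as pd

# Invented injury counts; column names are assumed for the example.
df_usa = pd.DataFrame({
    "Total.Fatal.Injuries": [0, 1, 0, 2],
    "Total.Serious.Injuries": [0, 0, 1, 1],
    "Total.Minor.Injuries": [0, 2, 0, 3],
})

# Weighted sum: 3 per fatal, 2 per serious, 1 per minor injury.
df_usa["Gravity.Score"] = (
    3 * df_usa["Total.Fatal.Injuries"]
    + 2 * df_usa["Total.Serious.Injuries"]
    + 1 * df_usa["Total.Minor.Injuries"]
)

print(df_usa["Gravity.Score"].describe())
```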

3.11 Categorization of accidents

The categorization of accident severity based on the weighted Gravity.Score reveals that a majority of aviation incidents in the dataset (55.2%) resulted in no injuries, indicating that over half of reported events involved only material damage or precautionary actions. Moderate accidents, defined by scores between 3 and 9, represent 23% of cases, reflecting events with notable human impact, such as multiple minor or some serious injuries. Minor accidents (scores between 1 and 2) account for 19.4%, suggesting isolated or less critical injuries. Only 2.4% of all accidents are classified as severe, involving substantial loss of life or numerous injuries. This distribution highlights that most incidents are of low to moderate severity, while high-impact events remain relatively rare, supporting the general perception of improved aviation safety over time.
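The categorization can be sketched with pd.cut. The bounds for the minor (1 to 2) and moderate (3 to 9) classes follow the text above; the threshold of 10 or more for the severe class and the label names are assumptions inferred from those bounds.

```python
import pandas as pd

# Invented sample of severity scores.
scores = pd.Series([0, 1, 2, 3, 9, 10, 795])

# Right-closed bins: (-1, 0], (0, 2], (2, 9], (9, inf).
bins = [-1, 0, 2, 9, float("inf")]
labels = ["No injury", "Minor", "Moderate", "Severe"]

category = pd.cut(scores, bins=bins, labels=labels)

# Share of each category, in percent.
share = category.value_counts(normalize=True) * 100

print(share)
```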

Business Recommendation 1

Strengthening safety in personal and amateur aviation

Justification

Personal-use flights account for the majority of accidents, and amateur-built aircraft are involved in more than 10% of incidents. These segments are therefore the most exposed to risk.

Recommendation

Insurers, regulators, and manufacturers should:

1-Require or encourage ongoing training for private pilots.

2-Implement specific maintenance protocols for amateur-built aircraft.

3-Provide financial incentives (e.g., insurance discounts) for owners who adopt advanced safety practices.

Business Recommendation 2

Adapt safety policies based on weather conditions

Justification

Under IMC conditions (instrument flight), nearly 50% of accidents are fatal, compared to less than 20% under VMC conditions.

Recommendation

Integrate advanced IMC flight training modules into private pilot and instructor training programs.

Invest in navigation assistance technologies for small aircraft (e.g., head-up displays, weather alert systems).

Assess high-risk routes based on seasonal trends and historical weather data.

Business Recommendation 3

Rethink Risk Management Based on Usage and Geography

Justification

California, Alaska, Texas, and Florida are the states with the highest number of accidents, partly due to a high volume of air traffic. Accidents are also more frequent during the summer and on weekends.

Recommendation

Insurance companies and regulators should adjust premiums or requirements based on:

1-The region (states with high traffic volume or complex topography like Alaska),

2-The season (peak periods in summer),

3-The day of the week (more flights on weekends).

Private flight operators could limit or better plan flights during peak seasons or weekends with high traffic.

Investment Decision-Making Insight

Based on over 60 years of aviation accident data from the NTSB (1962–2023), our strategic analysis provides key insights to guide investment decisions in the aviation sector. The objective: reduce risk exposure and optimize asset selection for a company entering the market.

Key Takeaways for Investors:

General aviation accounts for over 80% of reported accidents, particularly in private operations with weaker oversight. In contrast, commercial and charter flights offer lower risk and better regulatory frameworks, making them ideal for cautious entry strategies.

Models from Cessna and Piper, while frequently involved in accidents due to their widespread use, display moderate risk levels when normalized for exposure. They benefit from strong documentation, reliable maintenance ecosystems, and cost-effective operations, making them strategic investment targets.

Aircraft from smaller or lesser-known manufacturers often carry higher severity risks, even if they appear less often in the statistics, raising red flags for investors seeking predictable, insurable assets.

High-risk zones such as Alaska, California, and Texas require special attention in base planning and route design. Conversely, modern fleets post-2000 show reduced incident rates, making technologically updated aircraft a safer bet.

Strategic Recommendation: Investors should prioritize certified, data-backed, and technically supported aircraft, especially modern models from Cessna and Piper, to ensure low operational risk, high reliability, and long-term value.

This data-driven approach turns uncertainty into strategic clarity, offering a robust foundation for smart capital allocation in aviation.

Conclusion

As part of this strategic analysis aimed at supporting an investment decision in the aviation sector, we conducted an in-depth study of aviation accident data in the United States over a period of more than 60 years (1962–2023), provided by the National Transportation Safety Board (NTSB). The primary objective was to address a concrete risk management need: to identify the most reliable aircraft profiles in order to intelligently guide the purchase, operation, and deployment choices of a fictional company seeking to enter this market.

Our methodical approach was based on a combination of categorical and quantitative analyses structured around several key areas. First, the analysis of operator types revealed that over 80% of accidents are concentrated in general aviation, a segment where regulatory oversight and training vary significantly. In contrast, commercial and charter operations, which are more strictly supervised, present a reduced risk, making them particularly attractive for a cautious market entry strategy.

Second, the component focused on aircraft models provided essential insights. While Cessna and Piper models appear frequently in accident databases in absolute terms, this must be interpreted in the context of their widespread use and longevity on the market. Through a risk-scoring approach per incident, we demonstrated that these two manufacturers actually offer a moderate risk profile, supported by excellent technical documentation, a solid spare parts network, and compatibility with reliable maintenance programs. Therefore, investing in recent, well-maintained Cessna or Piper models appears to be a rational decision, combining operational reliability, cost accessibility, and risk control.

Conversely, aircraft from small or lesser-known manufacturers, despite appearing less frequently in accident statistics, show higher severity scores and sometimes lower safety standards. This asymmetry highlights that the rarity of an accident does not necessarily equate to aircraft safety, especially in the absence of a rich historical dataset.

Third, the geographic study revealed high-incident areas, particularly Alaska, California, and Texas, which should be taken into account when planning air bases and flight routes. At the same time, our temporal analysis showed that technological and regulatory advances since the 2000s have led to a continuous reduction in the number of incidents, offering a window of opportunity to invest in modern fleets equipped with automated systems and predictive maintenance mechanisms.

Finally, our analyses converge toward a clear strategic recommendation: to maximize safety while minimizing costs and exposure to risk, it is advisable to focus on certified, well-documented, and historically reliable aircraft, especially those from Cessna and Piper. Far from being simple observations, these conclusions form a solid foundation for decision-making, enabling the company to enter the aviation sector with clarity, pragmatism, and ambition.

Next Steps

To transform the results of this analysis into concrete and strategic actions, several steps are recommended to ensure the effective implementation of the recommendations and to maximize the profitability and safety of future investments:

Market Study of the Selected Aircraft Models

Based on the models identified as the most reliable, a thorough market study should be conducted: availability, acquisition costs, maintenance costs, and compatibility with the company’s operational objectives (private vs. commercial flights).

Development of a Training and Risk Management Plan

Given the strong involvement of human factors in the causes of accidents, the company must implement a rigorous pilot training program, along with a quality control system for flight procedures, maintenance, and incident management.

Strategic Selection of Operating Areas

Incorporate the geographic analysis to avoid high-risk areas or adapt operations in those regions with enhanced safety measures. The selection of hubs or operational bases should consider weather conditions, local regulations, and the region’s accident history.

Acquisition of Real-Time Data and Predictive Maintenance

Invest in modern technologies (IoT, onboard sensors, technical monitoring systems) to track fleet conditions in real-time and anticipate mechanical failures, thereby reducing downtime and unforeseen costs.

Economic and Financial Modeling

Build a financial model integrating reliability, accident rates, and costs to simulate multiple investment scenarios: aircraft types, number of units, amortization periods, expected return on investment—while including a safety margin based on identified risks.

Monitoring and Updating the Accident Database

Establish a continuous monitoring system to stay informed about new incidents, safety notices, manufacturer recalls, and emerging trends in aviation, in order to adjust the strategy continuously.

These next steps will allow the company to move from an analytical phase to an operational action strategy grounded in data, safety, and economic viability. They will ensure a controlled and resilient market entry into an industry as demanding as aviation.